Introduction


***This final report is largely identical to the previous Annual Report because
this was a one-year hypothesis development grant and the majority of the work
was completed within that year. This final report therefore includes the
previously reported information as well as the additional studies performed
during the 6-month no-cost extension (NCE) of the grant. Please note that the
additional text is in TIMES NEW ROMAN font to distinguish it as new material.


There were two primary goals of this one-year Hypothesis Development Award: to identify
genes whose alternative cleavage and polyadenylation (APA) may be altered during the
acquisition of androgen independence, and to determine whether H3K36 methylation changes
regulate this event. This document describes the results of our efforts and what we conclude
from them. As shown below, we found no evidence for a role for H3K36 methylation changes
in the regulation of APA, but we did discover a key factor that regulates APA and whose
expression is reduced in androgen receptor negative cells. Moreover, we have conducted
RNA-Seq to identify targets of this factor and found androgen signaling to be a major pathway
affected upon depletion of this protein. In this introduction, we provide a brief background on
the two key points of the proposal to facilitate review of the body and research accomplishments.




Alternative Cleavage and Polyadenylation. Recent large-scale deep sequencing analyses
have found evidence for extensive global diversity of poly(A) site choice as an organism
progresses through development or when quiescent T cells are activated through the T cell
antigen receptor (1-3). More importantly, these same changes in poly(A) site choice are
observed as cells undergo transformation (4). The underlying trend in all of these studies is
that an increased cell proliferation capacity (i.e. during tumorigenesis) is associated with a
global shift from the distal poly(A) site (dPAS) to a proximal poly(A) site (pPAS), which
predictably gives rise to more stable messages that are resistant to microRNA repression
(Figure 1). This appears to be a general property of cancer cells, as many different cancer
types were tested for proximal PAS usage of a few select genes and many were found to have
increased usage of these sites (4). We hypothesized that genes vital to acquisition of androgen
independence during prostate cancer tumor progression are regulated by APA. This was the
goal of Aim 1.


Figure  1.  Shifting  the  polyA  tail  site  to  a  proximal 
position  (pPA)  would  result  in  a  loss  of  microRNA 
inhibition  of  expression. 




H3K36 methylation. Since the hypothesis of the histone code by Allis and colleagues, it has
become clear that modifications of histones can have profound effects on gene expression.
While there have been a myriad of studies on H3K36 in particular, the one most relevant to
this research proposal was by Misteli and colleagues (5, 6). They observed that in prostate
cancer cells that undergo epithelial to mesenchymal transition, H3K36 is trimethylated by
Setd2. This trimethylation leads to recruitment of a reader protein that ultimately recruits a
splicing regulator, PTB (Figure 2). There has been no link established, thus far, between
histone methylation status and alternative cleavage and polyadenylation, but it appears
plausible. We hypothesized that alterations in the activity of Setd2 may change H3K36
methylation status in nucleosomes near the 3’ ends of genes, regulating APA. This was the
goal of Aim 2.

Figure 2. Setd2 activity changes result in altered K36 trimethylation. Altered K36me3
results in differential recruitment of splicing factor PTB to cause different amounts of
exon inclusion. (Figure from Wagner EJ & Carpenter, 2012)

Body 

In  this  section,  we  describe  both  positive  and  negative  results  acquired  during  the  one-year 
funding  period.  The  two  specific  aims  were  conducted  concurrently  but,  for  the  ease  of  reading 
and  presentation,  we  present  the  results  of  Specific  Aim  2  first. 

Specific Aim 2. The goal of this aim was to determine the position of H3K36me at the 3’ end of
genes identified as undergoing APA and to test the effect of Setd2 expression on APA.

Aim 2b. Modulate the expression level of Setd2 and determine the impact on APA using
real-time PCR analysis.

To begin this aim, we had to develop a real-time PCR assay for gene(s) undergoing APA
that Setd2 may regulate. To this end, we devised an approach focusing on three genes well
known to undergo APA: cyclin D1, Timp2, and Dicer1. For each gene, we designed one
amplicon specific to the mRNA that uses the distal polyA site and a common amplicon that
measures both the distal and proximal isoforms. We tested a battery of PCR primers, looking
for pairs that would generate a single band on an agarose gel and a clean melting curve
profile on a real-time PCR machine. As seen in Figure 3, we successfully generated the
amplicons for these three genes. Importantly, these amplicons measure the APA status of the
endogenous genes, so no reporter system was necessary.


To downregulate the expression of Setd2, we designed and purchased siRNAs specific to
human Setd2. One of these siRNAs had already been validated to give >90% knockdown.
Treatment of cells with these siRNAs resulted in essentially no change in the level of APA
usage for all three test genes (Figure 4, only cyclin D1 is shown). This is evidenced by the
final column of the data, which shows that the log difference in distal relative to common
signal in both knockdowns is very similar to that in the control siRNA treated cells (C2).


Cyclin D1: Ct values (left) and dCt values after subtracting 7SK (right)

Sample     Common   Corr. common   Distal   7SK     dCt Common   dCt Distal   Diff    Log diff.
C2         26.52    27.34          27.89    12.73   14.61        15.16         0.55   1.464086
C2         26.68    27.5           27.27    11.88   15.62        15.39        -0.23   0.852635
C2         26.69    27.51          27.63    11.84   15.67        15.79         0.12   1.086735
Set2 si1   26.87    27.69          26.77    12.84   14.85        13.93        -0.92   0.528509
Set2 si1   25.65    26.47          26.7     13.05   13.42        13.65         0.23   1.172835
Set2 si1   28.06    28.88          29.53    12.71   16.17        16.82         0.65   1.569168
Set2 si2   27.65    28.47          28.67    13.04   15.43        15.63         0.2    1.148698
Set2 si2   25.78    26.6           26.03    12.89   13.71        13.14        -0.57   0.673617
Set2 si2   26.86    27.68          27.31    12.03   15.65        15.28        -0.37   0.773782

Figure 4. Real-time PCR analysis of proximal versus distal choice of cyclin D1 mRNA. The values are Cts
for each of the amplicons shown in Figure 3 for cyclin D1. C2 represents the negative control siRNA.
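
The arithmetic behind the Figure 4 columns can be reproduced with a short script. The sketch below is a minimal illustration in Python. It assumes, based on the tabulated numbers rather than on a stated protocol, that each dCt is the amplicon Ct minus the 7SK Ct, that Diff is the distal dCt minus the common dCt, and that the final column is 2 raised to Diff. The function and argument names are hypothetical.

```python
def distal_vs_common(ct_common_corr, ct_distal, ct_7sk):
    """Return (dCt common, dCt distal, Diff, 2**Diff) for one sample.

    Assumption: 7SK is the normalizer, and the last value mirrors the
    "Log diff." column of the Figure 4 table.
    """
    dct_common = ct_common_corr - ct_7sk   # corrected common amplicon, normalized to 7SK
    dct_distal = ct_distal - ct_7sk        # distal amplicon, normalized to 7SK
    diff = dct_distal - dct_common         # distal signal relative to common signal
    return dct_common, dct_distal, diff, 2 ** diff


# First control (C2) replicate from the Figure 4 table.
dct_common, dct_distal, diff, ratio = distal_vs_common(27.34, 27.89, 12.73)
print(round(dct_common, 2), round(dct_distal, 2), round(diff, 2), round(ratio, 6))
```

Run on the first C2 row, this reproduces the tabulated dCt values, Diff, and final column for that replicate; the same call can be applied row by row to compare the Setd2 knockdowns with the control.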


Figure  3.  Amplicon  design  for  three  test 
genes  known  to  have  alternative  cleavage 
and  polyadenylation. 
While we were confident of the technique and the ability of the siRNA to deplete Setd2, we
sought to develop a positive control siRNA targeting a protein that modulates APA. Systematic
testing of the 15 known polyA processing factors led us to focus our studies on CFIm25.
Depletion of this protein from cells led to enhanced usage of the proximal polyA site, which is
what we had anticipated would happen after depletion of Setd2. What makes CFIm25 so
interesting in prostate cancer is our observation that CFIm25 is downregulated in prostate
cancer cell lines that are androgen-receptor negative (Figure 5). This result was followed up in
Specific Aim 1 (see below).

Figure 5. Western blot analysis of lysates from prostate cancer cell lines that are either
androgen receptor positive (AR+) or negative (AR-).


Aim 2a. Perform ChIP assays and real-time PCR to map the position of H3K36me.

To conduct the ChIP assay, we designed a large set of amplicons throughout the cyclin D1
gene body, its promoter, and the downstream region. The amplicons are of “ChIP-grade”
quality, as can be seen in Figure 6 (agarose gels). We performed ChIP assays using two control
antibodies raised to RNA polymerase II (RNA Pol2) and histone H3. We observed significant
histone H3 signal at the promoter and then again at the distal polyA site. The graph shown is
representative of multiple experiments. We were not able to get an H3K36me3 antibody to work
on this gene, and troubleshooting was not continued given that the results from Specific Aim 2b
did not suggest that Setd2 was functioning on this gene.
Figure 6. Upper panels, agarose gel staining of amplicons throughout the cyclin D1 gene.
The schematic directly below the gels represents the gene structure of the cyclin D1 gene
and the relative position of the amplicons. The graph shows the results of real-time PCR of
ChIP samples, plotted as percent of input.
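
For readers unfamiliar with the "percent of input" scale used in the Figure 6 graph, the following Python sketch shows one commonly used way to compute it from ChIP-qPCR Ct values. It is an illustration only: the report does not state the input fraction or the exact calculation used, so the default `input_fraction` of 1%, the assumption of perfect doubling per cycle, and the function name are all assumptions.

```python
import math


def percent_input(ct_ip, ct_input, input_fraction=0.01):
    """Percent of input for one amplicon.

    Assumptions (not taken from the report): the input sample is
    `input_fraction` of the chromatin used for the IP, and amplification
    efficiency is 100% (signal doubles every cycle).
    """
    # Shift the input Ct to what it would be if 100% of the chromatin were assayed.
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    # Each cycle of difference corresponds to a twofold difference in template.
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)


# Hypothetical Ct values, for illustration only.
print(percent_input(ct_ip=24.0, ct_input=25.0))
```

With this convention, an IP whose Ct equals that of a 1% input corresponds to 1% of input, and each cycle earlier doubles the value.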
Specific Aim 1. The goal of this aim is to perform deep sequencing of mRNA isolated from
different prostate cancer cell lines. Within this dataset we hope to identify novel instances of
APA.

Aim 1a. Isolate total RNA, enrich for mRNA, perform 3’ tagging on mRNA, and submit for deep
sequencing.

This specific aim gave us the most trouble because, at the time of the grant submission,
there was no developed protocol describing how to do 3’ tagging to sequence the junctions
between polyA sites and polyA tails. Fortunately, a paper was published describing a new
technology called “3PSeq”, which was exactly the technique that we needed. We collaborated
with a local deep sequencing company, LC Sciences, to develop 3PSeq. This technique is
designed to pare each mRNA down to just the few bases prior to the site of polyA tail addition
plus a few nucleotides of the polyA tail. As seen in Figure 7, we were able to obtain sequences
containing polyA tails at the expected positions; however, the sequencing depth was not
sufficient to make quantitative measurements.


Sequence (mRNA strand)                       Translated protein
TTTTTAGTTGTCTAAATAAAATGCCTCTAAAACAAAAAAA     STRAP
GTTTTTAGTTGTCTAAATAAAATGCaaAAAACAAAAAA       STRAP
CCTGCCaaCCCTGAAATAAAGAACAGCTTGACAiAAAA       RPL36
CCaGTTCTGGACATTTCATATAAATGGAATCACACAAAA      CBS
CCTGTTCTGGACATTTCATATAAATGGAATCACACAAAAA     CBS
aGTTaGGACATTTCATATAAATGGAATCACACAAAAAA       CBS
atggaccagtcaaataaaagccttcaggcccctcaaaaaa     NDUFB7

Figure  7.  Example  sequencing  reads  from  the 
3PSeq  experiment.  The  polyA  sites  are  labeled  in 
green  and  the  polyA  tail  is  in  red. 
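
The kind of read shown in Figure 7 can be inspected programmatically. The sketch below, in Python, splits a read into its transcript body and its trailing run of A residues, then looks for the canonical AAUAAA poly(A) signal (written AATAAA in DNA letters) upstream of the junction. It is a minimal illustration and not the pipeline used for the 3PSeq data; the function names and the `min_tail` threshold are hypothetical, and the example read is the first STRAP read as transcribed in Figure 7.

```python
import re

PAS_HEXAMER = "AATAAA"  # canonical poly(A) signal, DNA alphabet


def split_polya_read(read, min_tail=4):
    """Split a read into (body, tail) at the start of its trailing run of A's.

    Assumption: a trailing run of at least `min_tail` A's is treated as the
    poly(A) tail; genomically encoded A's at the junction cannot be told apart.
    """
    read = read.upper()
    match = re.search(r"A+$", read)
    if match is None or len(match.group()) < min_tail:
        return read, ""
    return read[:match.start()], match.group()


def find_pas(body, hexamer=PAS_HEXAMER):
    """Return the 0-based start of the last hexamer occurrence in body, or -1."""
    return body.rfind(hexamer)


read = "TTTTTAGTTGTCTAAATAAAATGCCTCTAAAACAAAAAAA"  # STRAP read from Figure 7
body, tail = split_polya_read(read)
pas_start = find_pas(body)
print(len(tail), pas_start, len(body) - pas_start)
```

The last printed number is the distance from the start of the hexamer to the inferred junction, which gives a quick sanity check that the read ends near a plausible polyA site.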


In light of these results and the fact that we observed CFIm25 to be downregulated in AR-
cell lines, we chose to conduct standard RNA-seq under conditions of CFIm25 RNAi versus
control RNAi. The goal was then to use bioinformatics to filter only the 3’ end reads to identify
instances of APA that are regulated by CFIm25. Once these instances are identified, we would
then use qRT-PCR to measure the APA status of these genes in AR+ versus AR- prostate cells.
This turned out to be an excellent idea and constitutes one of the major findings during the
funding period. We isolated RNA from control and CFIm25 knockdown cells and performed
RNA-Seq. We achieved >115,000,000 reads for each sample, which is an outstanding dataset.
Examples of the data are shown below, where we determined the read density difference of
cyclin D1, Dicer1, and Timp2 in control versus CFIm25 knockdown (Figure 8). In each case, it
is evident that downregulation of CFIm25 leads to a reduction in read density in the 3’ UTR,
demonstrating that this factor regulates APA. Dr. Wei Li, of Baylor University, has designed
software to analyze these data in a way that identifies all instances of 3’UTR shortening. We
now have 2,181 candidate genes that are regulated by CFIm25. Gene ontology and pathway
analysis revealed that the second most affected pathway is androgen signaling. We have
requested a six-month NCE in order to further analyze the list of genes undergoing APA after
CFIm25 knockdown and to test them using qRT-PCR in cells that are androgen-dependent
(AR+) versus androgen-independent (AR-).


Figure 8. Read densities generated from RNA-seq data from control siRNA treated cells
versus CFIm25 knockdown cells. (Panel labels: Dicer1, Cyclin D1.)
Development of DPDUI to analyze RNA-seq data to identify novel APA events. As mentioned above,
Dr. Wei Li of Baylor University helped us analyze our RNA-seq data to find APA events. This has not
been done before and thus represents a highly novel finding from the granting period. Here, we
briefly describe the algorithm that was developed and why it is so powerful. The examples of the
RNA-seq data shown in Figure 8 were found by manually scanning through the data, which has the
obvious disadvantages of being inefficient, potentially inaccurate, and not quantifiable. To address
this, Dr. Wei Li developed an algorithm capable of measuring changes in 3’UTR structure in response
to CFIm25 knockdown. This algorithm identifies the differences in 3’UTR read density between
control siRNA transfected and CFIm25 siRNA transfected cells and focuses on the inflection point
where the two datasets diverge (Fig. 9A). Using this algorithm, we found that CFIm25 is a global
repressor of proximal polyadenylation, as 789 genes had a shorter 3’UTR and only 35 had a longer
one (Fig. 9A, right panel). This list of CFIm25 regulated genes identified hundreds of novel targets,
and representative members of that group are shown in Figure 9B. There is clear specificity of
CFIm25, as not all genes exhibited a change after knockdown (Fig. 9C).

Figure 9. Novel algorithm to identify APA events in RNA-seq data. A. Example of how the algorithm
analyzes altered read density within a gene’s 3’UTR. Right: graphical analysis showing that the
majority of altered 3’UTRs exhibit shortening. B. Examples of novel CFIm25 targets. C. Examples of
genes unaffected by CFIm25 knockdown.

Identification of Glutaminase as a major CFIm25 regulated target. This unique list of CFIm25-
regulated genes has uncovered a profound effect on the glutaminase gene, where there is isoform-
specific 3'UTR shortening (Fig. 10A). Glutaminase encodes two isoforms (GLS1/2) with distinct
3'UTR regions. The 3'UTR of the GLS1 isoform is specifically regulated by CFIm25 whereas that of
GLS2 is not. Western blot analysis from CFIm25 knockdown cells confirms that cells have an
alternative, myc-independent pathway to activate glutaminase expression that involves CFIm25 and
APA (Fig. 10B/D). Not only does CFIm25 depletion cause cells to grow faster (Fig. 10C), but the
removal of glutamine also causes growth arrest (Fig. 10E/F). Results from the Dang lab published in
Nature a few years ago (7) demonstrate that PC3 PCa cells overexpress myc, leading to mir-23
downregulation and subsequent GLS de-repression (Fig. 10G). We believe our results uncover a new
myc-independent pathway for PCa cells to overexpress glutaminase, leading to increased
proliferation.

Figure 10. CFIm25 regulates APA of glutaminase. A. Genome viewer showing that the GLS1 isoform is in
lower abundance in control cells and its 3'UTR is shortened after knockdown. B. Protein expression of GLS
increases dramatically in multiple cell lines after CFIm25 depletion. C. The graph in the middle shows the
increased growth kinetics in cells after depletion of CFIm25. D. Same as B except 293T cells. E. Brightfield
images of CFIm25 knockdown cells +/- glutamine supplementation. F. Graph showing glutamine dependent
growth kinetics. G. Model of alternative myc-independent pathway to overexpress GLS using APA.
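
The idea of locating the point in a 3’UTR where control and knockdown read densities diverge can be illustrated with a toy calculation. The Python sketch below is not the algorithm developed by Dr. Li, whose details are not given in this report; it is a simplified stand-in that fits a two-level step to both coverage profiles, picks the shared breakpoint with the smallest squared error, and compares downstream-to-upstream coverage ratios. All function names, the `margin` parameter, and the coverage values are hypothetical.

```python
def step_sse(cov, bp):
    """Squared error of fitting one mean before position bp and one mean after it."""
    total = 0.0
    for segment in (cov[:bp], cov[bp:]):
        mean = sum(segment) / len(segment)
        total += sum((x - mean) ** 2 for x in segment)
    return total


def find_breakpoint(control, knockdown, margin=2):
    """Position at which a shared two-level step fits both profiles best.

    Assumption: both profiles cover the same 3'UTR, base by base, and
    `margin` keeps at least that many positions on each side.
    """
    n = len(control)
    candidates = range(margin, n - margin + 1)
    return min(candidates,
               key=lambda bp: step_sse(control, bp) + step_sse(knockdown, bp))


def distal_ratio(cov, bp):
    """Mean coverage downstream of bp divided by mean coverage upstream of it."""
    upstream = sum(cov[:bp]) / bp
    downstream = sum(cov[bp:]) / (len(cov) - bp)
    return downstream / upstream if upstream else 0.0


# Toy per-base coverage over a 20-nt 3'UTR (hypothetical numbers).
control = [100] * 10 + [90] * 10      # long 3'UTR isoform dominates
knockdown = [100] * 10 + [20] * 10    # coverage drops after a proximal site

bp = find_breakpoint(control, knockdown)
change = distal_ratio(knockdown, bp) - distal_ratio(control, bp)
print(bp, round(change, 2))
```

A negative change in the ratio corresponds to reduced coverage of the distal 3’UTR after knockdown, which is the signature of 3’UTR shortening described in the text.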
Aim 1b. Analyze the dataset from deep sequencing and perform qRT-PCR validation of identified
instances of APA.

The experiments described above were performed in HeLa and 293T cells. We have since confirmed
many of the results in other cell lines, and our goal during the NCE period was to analyze
CFIm25-regulated APA events in PCa cells that express it (e.g. AR+ cells). To this end, we attempted
siRNA-mediated knockdown of CFIm25 in LNCaP cells and had enormous difficulty. We experienced
significant cell death that was due to the transfection reagent and/or conditions. Optimizing the
conditions to reduce cell death resulted in unconvincing knockdown. To address this, we purchased
five different lentiviruses that express shRNA targeting CFIm25. These lentiviral vectors also allowed
us to use puromycin as a means of selecting for stable CFIm25 knockdowns. It took several months to
isolate stable lines with this approach and, in the end, only one control and one knockdown line have
been developed. Western blot analysis shown in Figure 11 demonstrates that we do have stable
reduction in CFIm25 in this line and the cells are growing well. Unfortunately, we have not been able
to analyze any of the CFIm25 targets, but we anticipate doing this in the near future.

Figure 11. Western blot analysis of lentiviral-transduced stable LNCaP cell lines expressing shRNA
targeting either CFIm25 or a control scrambled target. Loading control is tubulin. (Lanes: control
shRNA and CFIm25 shRNA; blots: anti-CFIm25 and anti-tubulin.)

Key  Research  Accomplishments 

-Setd2 does not regulate genes known to undergo APA
-CFIm25 is downregulated in androgen receptor negative PCa cells
-CFIm25 regulates APA of genes involved in androgen signaling
-CFIm25 regulates GLS expression in a myc-independent fashion.


Reportable  outcomes: 

Abstracts: 

1. Identification of factors that drive alternative cleavage and polyadenylation
in cancer. Chioniso P Masamha, Ann-Bin Shyu, Wagner EJ. Poster
presented at the Eukaryotic mRNA Processing meeting (2011), CSHL, NY.


Manuscripts: 

1. *Wagner EJ, *Carpenter PB: Translating the Language of H3K36. Nature
Reviews Molecular Cell Biology. Jan 23;13(2):115-26, 2012
[PMID: 22266761] (*co-corresponding authors)

2. Masamha CP, Zheng Y, Li W, Shyu AB, Wagner EJ. CFIm25 is a
Global Repressor of Proximal PolyA Site Selection and Suppresses
Growth (2012). Manuscript in preparation.
Databases:


Once  we  conclude  our  bioinformatics,  we  will  generate  a  searchable  database  for  CFIm25 
targets  that  will  be  cited  in  the  above  manuscript  and  accessible  through  the  Wagner  laboratory 
webpage. 


Funding  Applied  for: 

NIH  (1R01CA166274-01)-grant  submitted  in  June  2011  and  funded 
Title:  Developing  a  HTS  Assay  for  Inhibitors  of  Alternative  Cleavage  and 
Polyadenylation 
P.I.:  Eric  J.  Wagner 
Project  Period:  4/1/2012-  3/31/2014 

NIH (1R03CA167752-01A1)-grant submitted February 2012 and has a priority score
of 16, which is within the funding range
Title: Alternative Cleavage and Polyadenylation Events as Biomarkers
P.I.: Eric J. Wagner
Project Period: 12/1/2012 - 11/31/2014

DoDBC: (BC121045)-pre-proposal approved and grant submitted August 2012
Title: Investigating the Role of Hypoxamirs and Alternative Cleavage and
Polyadenylation During Tumor Progression. (Idea Award)
P.I.: Eric J. Wagner
Project Period: 5/1/2013 - 4/30/2016


Conclusions 


This proposal was to be carried out over the course of just one year and was meant to test a
discrete hypothesis. While we were disappointed to determine that Setd2 was not regulating
APA in a way that we could measure, the identification of a factor that regulates APA and is
altered in PCa cells is exciting. The finding that CFIm25 regulates the androgen signaling
pathway is important and novel, as very little is known about the regulation of APA. The
questions that these findings have generated include: how is CFIm25 downregulated in AR-
cells, and what impact does this have on APA in PCa cells? These questions are new to the
field and will act as a platform on which to base future, more substantial research proposals
and programs.
REVIEWS


POST-TRANSLATIONAL MODIFICATIONS


Understanding  the  language  of  Lys36 
methylation  at  histone  H3 


Eric  J.  Wagner  and  Phillip  B.  Carpenter 

Abstract | Histone side chains are post-translationally modified at multiple sites, including
at Lys36 on histone H3 (H3K36). Several enzymes from yeast and humans, including the
methyltransferases SET domain-containing 2 (Set2) and nuclear receptor SET domain-
containing 1 (NSD1), respectively, alter the methylation status of H3K36, and significant
progress has been made in understanding how they affect chromatin structure and function.
Although H3K36 methylation is most commonly associated with the transcription of active
euchromatin, it has also been implicated in diverse processes, including alternative splicing,
dosage compensation and transcriptional repression, as well as DNA repair and
recombination. Disrupted placement of methylated H3K36 within the chromatin landscape
can lead to a range of human diseases, underscoring the importance of this modification.


Dosage  compensation 
The  mechanism  by  which 
expression  levels  from  sex 
chromosomes  are  adjusted. 

In  mammalian  systems,  one 
copy  of  the  X  chromosome 
is silenced in the female.

By  contrast,  in  Drosophila, 
genes  on  the  male 
X  chromosome  are 
expressed  at  twofold  levels. 

SET  domain 

(Suppressor  of  variegation  3-9, 
Enhancer  of  Zeste  and 
Trithorax  domain). 

A  catalytic  domain  that  uses 
S-adenosylmethionine  to 
transfer  methyl  groups 
to  substrates. 


Department of Biochemistry
and Molecular Biology,
The University of Texas
Medical School, Houston,
Texas 77050, USA.
e-mails:
Eric.J.Wagner@uth.tmc.edu;
Phillip.B.Carpenter@uth.tmc.edu
doi:10.1038/nrm3274


The packaging of DNA with basic histones and a vast
array of additional factors mediates the formation of
chromatin and thereby determines the outcome of virtually
all of the DNA processes in eukaryotes. Acetylation
and methylation of histones were originally identified
in radiolabelling studies using cell extracts. Today,
enzymes catalysing numerous histone post-translational
modifications or ‘marks’ have been identified; these
include phosphorylation, various modes of methylation
at Arg and Lys side chains, acetylation and ubiquitylation.
These marks are thought to exist in dynamic combinations
to generate a ‘code’, or ‘language’, that can enforce
the regulatory features of chromatin during nearly all
of the aspects of cellular metabolism: a given modification
or permutation of modifications dictates a distinct
biological output. Given the critical biological roles of
chromatin and the numerous pathologies related to its
misregulation, understanding the role of histone modifications
is a paramount, yet complicated task. In this
Review, we focus on our current understanding of a key
modification, the methylation of Lys36 at histone H3
(H3K36). Widely described to be associated with active
chromatin, H3K36 methylation has also been implicated
in transcriptional repression, alternative splicing, dosage
compensation, DNA replication and repair, DNA methylation
and the transmission of the memory of gene expression
from parents to offspring during development. We
highlight the specificity, regulation and functional role of
the enzymes that methylate H3K36, but we refer readers
to informative reviews for a detailed discussion of the
counteracting demethylases.


Regulating  the  methylation  of  H3K36 

Histone methyltransferase (HMTase) enzymes use
S-adenosylmethionine to add methyl groups to specific
histone Lys or Arg residues. To date, at least eight
distinct mammalian enzymes have been described that
methylate H3K36 in vitro and/or in vivo (FIG. 1). All of
the H3K36-specific methyltransferases identified thus
far have the catalytic SET domain in common, but they
have varying preferences for Lys36 residues in different
methylation states. In yeast, SET domain-containing 2
(Set2) performs all three methylation events at H3K36
(REF. 6), but in higher eukaryotes there is accumulating
evidence that these events require a division of labour
between the mono- and dimethylases and the SET2-type
trimethylases (FIG. 1). Several of these enzymes also possess
multiple chromatin-interacting domains, including those
known to interact with methylated H3K36 itself (such
as the PWWP domain) and with additional methylated
histone residues (such as plant homeodomain fingers (PHD
fingers)) (FIG. 1). Notably, SET2 proteins from multiple
species have a carboxy-terminal domain that interacts
with the large subunit of RNA polymerase II (RNAPII),
which is known as RNAPII subunit B1 (RPB1).

Determining HMTase specificity. The most parsimonious
scenario would predict that each H3K36 methyltransferase
enzyme catalyses a transition between two
distinct methylation states (for example, between
unmethylated H3K36 and monomethylated H3K36
(H3K36me1)). However, this is not necessarily the
case. There are many discrepancies as to the level of
methylation (mono-, di- and tri-) imparted, as well as
the residue (or residues) that is targeted, by the various
H3K36-specific methyltransferases; these discrepancies
may arise from multiple factors, including the nature
of the substrate tested (for example, peptides and histones
versus nucleosomes), the source of enzyme (for
example, full length versus SET domain only) and the
assay conditions themselves (for example, antibody
specificity versus mass spectrometry). Physiologically
relevant substrates, characterized using nucleosomes
and/or loss-of-function experiments, have been determined
for some H3K36 methyltransferases (TABLE 1);
but for enzymes such as SETD3, the enzymatic activities
reported so far are based on analyses of peptides
and core histones.

Figure 1 | Domain structures of enzymes that methylate H3K36. a | A schematic of the enzymes that have been
shown to promote the formation of methylated Lys36 on histone H3 (H3K36). With the exception of fly maternal-effect
sterile 4 (MES-4), only human enzymes are shown. The SET domain is shown with its pre (AWS) and post domains;
C5HCH is a zinc-finger (ZNF) domain; WW domains are known to interact with Pro-rich peptides; the PWWP domain is
known to interact with trimethylated H3K36; and the AT hook is a DNA-binding domain. All domain assignments were
derived from Ensembl. b | Depiction of the transitions between the multiple H3K36 methylation states, highlighting the
enzymes that have been shown to function in changing a given methylation state. ASH1L, ASH1-like; BAH, bromo-
associated homology; BROM, bromodomain; HMG, high mobility group; MYND, myeloid, Nervy and DEAF-1 ZNF; NIDs,
nuclear receptor interaction domains; NSD, nuclear receptor SET domain-containing; PHD, plant homeodomain;
RuBisCo, ribulose-1,5-bisphosphate carboxylase oxygenase; SETD, SET domain-containing; SETMAR, SET domain and
mariner transposase fusion gene-containing; SMYD2, SET and MYND domain-containing 2.


PWWP domain
(Pro-Trp-Trp-Pro domain).
A chromatin-interacting
domain that has recently been
shown to bind trimethylated
Lys36 on histone H3.

Plant homeodomain fingers
(PHD fingers). Motifs that bind
zinc and have been shown to
bind methylated residues.

NSD1 as a mono- and dimethylase for H3K36. Although
it was originally shown to bind steroid nuclear receptors,
nuclear receptor SET domain-containing 1 (NSD1;
also known as KMT3B) was subsequently reported to use
its SET domain to methylate H3K36 as well as H4K20
(REF. 14). However, NSD1 also methylates non-histone
substrates, such as the p65 subunit of nuclear factor-κB
(NF-κB). Indeed, several HMTases have multiple substrates,
including non-histones, and it is important to
consider this when interpreting phenotypes derived from
loss-of-function experiments.

NSD1 has specific mono- and dimethylase activity
for H3K36 (REFS 16-18), generating H3K36me1 and
dimethylated H3K36 (H3K36me2), and there are also
mixed reports as to whether it has specificity for H4K20
(REFS 14,19,20). Enzymatic assays using recombinant
nucleosomes containing unmethylated H3K36 or a
mimic of H3K36me1 (REF. 21) were shown to serve as
specific substrates for NSD1 (REF. 16). However, when
histone octamers were used as substrates, NSD1 was
found to additionally methylate histone H4 as well as
histones H2A and H2B, suggesting that nucleosomes
contribute to H3K36 specificity. Structural data have
also confirmed the specificity of NSD1 as a mono- and
dimethylase that targets H3K36 (REF. 18). Intriguingly,
this study showed that the substrate-binding channel of
NSD1 is blocked by an autoinhibitory loop that resides
between the SET and post-SET domains. As Lys36
resides near the core of the nucleosome, interactions
between the enzyme and DNA may allosterically relieve
inhibition by this loop, a result that is consistent with
the preference of NSD enzymes for nucleosome substrates,
as opposed to octamers. This further underscores
the necessity for using physiologically relevant
Table 1 | Reported substrate specificities for enzymes that methylate H3K36

Enzyme*               Assay conditions           Histone substrates identified           Refs
--------------------  -------------------------  --------------------------------------  --------------
NSD1 (KMT3B)          Nucleosomes                H3, H3K36, H3K36me1, H3K36me2, H4K20    14, 16-18
                      Cell based and/or in vivo  H3K36me1, H3K36me2, H4K20me3            17-19, 78
NSD2 (WHSC1, MMSET)   Nucleosomes                H3, H3K36, H3K36me1, H3K36me2           16, 17, 19, 30
                      Cell based and/or in vivo  H3K27me2‡, H3K36me2                     17, 28, 36
NSD3 (WHSC1L1)        Nucleosomes                H3, H3K36                               16
                      Cell based and/or in vivo  H3K36me2                                40
ASH1; ASH1L           Nucleosomes                H3, H3K36, H3K36me2, H3K36me3§          41, 48, 49
                      Cell based and/or in vivo  H3K4, H3K4me3                           49
SMYD2                 Core histones (octamers)   H3, H3K4, H3K36me2                      42, 43
SETMAR (METNASE)      Core histones (octamers)   H3K4, H3K36me2                          12
                      Cell based and/or in vivo  H3K36me2                                96
SETD2 (KMT3A, HYPB)   Nucleosomes                H3, H3K36me1, H3K36me2, H3K36me3        34, 64
                      Cell based and/or in vivo  H3K36me3                                25, 33, 34
SETD3                 Core histones (octamers)   H3K4, H3K36                             11

Table entries for the knockdown (cell based) or knockout (in vivo) analysis indicate a reported
decrease in a given methylation state. ASH1, Absent, small and homeotic discs 1; ASH1L,
ASH1-like; H, histone; NSD, nuclear receptor SET domain-containing; me1, monomethylated;
me2, dimethylated; me3, trimethylated; SETD, SET domain-containing; SETMAR, SET domain
and mariner transposase fusion gene-containing; SMYD2, SET and MYND domain-containing 2.
*Alternative names are provided in brackets. ‡RE-IIBP isoform of NSD2. §Gln2265Ala mutant
of ASH1L.

Allosteric 

Pertaining  to  the  process  by 
which  a  binding  event  at  one 
site  influences  the  activity  of  an 
enzyme  or  protein  at  a  second, 
distant  site.  This  may  lead  to 
activation  or  inhibition  by 
cooperation  between  ligands, 
when  a  ligand  bound  at  one 
site  affects  the  affinity  of 
another  site  for  its  ligand  by 
inducing  transitions  between 
distinct  conformational  states. 


substrates  in  order  to  properly  characterize  enzyme 
specificity  towards  H3K36. 

NSD1 controls the levels of methylated H3K36
within and surrounding the body of the bone morphogenic
protein 4 (BMP4) gene in human HCT116 colorectal
cancer cells. In the absence of NSD1, BMP4
expression is significantly impaired and the levels
of all three forms of methylated H3K36 are reduced
within the body of the gene. These data could support
a model in which NSD1 catalyses H3K36 trimethylation
(H3K36me3), but a more likely scenario is that the
loss of mono- and dimethylation at H3K36 also inhibits
trimethylation by failing to provide another HMTase,
SETD2 (also known as KMT3A and HYPB), with its
substrate. This is an important distinction to make when
considering which HMTases regulate the formation of a
trimethylated residue.

The role of NSD1 as a mono- and dimethylase also
seems to be conserved in other metazoans. Worms and
flies each have one NSD1 orthologue called maternal-effect
sterile 4 (MES-4). In Caenorhabditis elegans,
MES-4 has been reported to be a dimethylase that is specific
for H3K36 (REF. 22). Similarly to NSD1, Drosophila
melanogaster MES-4 catalyses global mono- and dimethylation
of H3K36 in vivo, but SET2, the fly orthologue of
human SETD2, regulates all trimethylation at this site;
again, MES-4 is probably required to provide the proper
dimethylated H3K36 substrate for the SET2 enzyme.


In worms, mutation of the SETD2 orthologue histone
methyltransferase-like 1 (met-1) significantly reduced
global H3K36me3 levels. This supports the notion
that the worm MES-4 and MET-1 proteins coordinate
dimethylation and trimethylation, respectively. These
experiments are generally consistent with studies of
mammalian NSD proteins and SETD2 (REFS 16,25). By
contrast, separate studies showed by immunostaining
that worm embryos individually deficient in mes-4
or met-1 retain significant levels of H3K36me3, but
embryos doubly deficient in mes-4 and met-1 are completely
devoid of H3K36me3 (REFS 26,27). This suggests
that MES-4 collaborates with MET-1 to perform
trimethylation. As each of these studies has used different
antibodies, it will be important to validate these
experiments using additional, antibody-independent
techniques, such as mass spectrometry. Collectively, the
combination of in vitro and in vivo data suggests that
NSD1 and its orthologues probably catalyse the addition
of either the monomethyl or dimethyl groups onto
H3K36 and so indirectly regulate the levels of trimethylation
by altering the availability of monomethyl and
dimethyl substrates for the trimethylating enzymes.
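
The substrate-availability argument can be illustrated with a toy kinetic sketch. Everything in it is an assumption made for illustration and is not taken from the cited studies: the four-state chain (me0 → me1 → me2 → me3), the function name `simulate_h3k36`, the single turnover rate that resets any methylated state to me0, and all rate constants are hypothetical and arbitrary. The point is qualitative only: lowering the mono- and dimethylase rate lowers steady-state H3K36me3 even though the trimethylase rate is unchanged.

```python
def simulate_h3k36(k_mono_di: float, k_tri: float, turnover: float = 1.0,
                   dt: float = 0.01, steps: int = 5000) -> dict:
    """Euler integration of a toy chain me0 -> me1 -> me2 -> me3.

    k_mono_di : rate of the me0->me1 and me1->me2 steps (NSD1-like activity, assumed)
    k_tri     : rate of the me2->me3 step (SETD2-like activity, assumed)
    turnover  : rate at which any methylated state returns to me0 (assumed)
    Returns the fractions of H3K36 in each state after `steps` time steps.
    """
    me0, me1, me2, me3 = 1.0, 0.0, 0.0, 0.0
    for _ in range(steps):
        f01 = k_mono_di * me0   # monomethylation flux
        f12 = k_mono_di * me1   # dimethylation flux
        f23 = k_tri * me2       # trimethylation flux
        d0 = turnover * (me1 + me2 + me3) - f01
        d1 = f01 - f12 - turnover * me1
        d2 = f12 - f23 - turnover * me2
        d3 = f23 - turnover * me3
        me0 += dt * d0
        me1 += dt * d1
        me2 += dt * d2
        me3 += dt * d3
    return {"me0": me0, "me1": me1, "me2": me2, "me3": me3}


if __name__ == "__main__":
    # Same trimethylase rate in both runs; only the mono-/dimethylase rate differs.
    wild_type = simulate_h3k36(k_mono_di=1.0, k_tri=5.0)
    depleted = simulate_h3k36(k_mono_di=0.05, k_tri=5.0)
    for state in ("me1", "me2", "me3"):
        print(state, round(wild_type[state], 4), round(depleted[state], 4))
```

With these arbitrary parameters, all three methylated states fall when the mono- and dimethylase rate is lowered. Other parameter choices can behave differently (for example, me1 can rise if turnover is slow), so the sketch supports the logic of the argument rather than any quantitative claim.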

NSD2 as a mono- and dimethylase for H3K36. Similarly
to NSD1, NSD2, the product of Wolf-Hirschhorn
syndrome candidate gene 1 (WHSC1; also known as
MMSET), has been reported to target H3K36, as well
as H4K20, H3K4 and H3K27 (REFS 16,28-32). Here
again, these discrepancies may lie, at least in part, with
the nature of the substrates used in the assays. For
example, NSD2 acts as a dimethylase towards H3K36
when presented with nucleosomes, but it preferentially
dimethylates H4K44 when presented with octamers.
Interestingly, when short single-stranded and double-stranded
DNA molecules that are notably smaller than
the length that is necessary to generate a nucleosome
are added to octamers, NSD2 preferentially dimethylates
H3K36, a result that has been attributed to the ability
of DNA to act as an allosteric effector of NSD2 (REF. 16).
This is consistent with another study, in which mass
spectrometry and immunoblotting against native and
recombinant nucleosomes showed that NSD2, as well
as its major splicing variant, RE-IIBP, acts as a mono-
and dimethylase specific for H3K36 (REF. 30). By contrast,
there have been reports that the NSD2 isoform RE-IIBP
is an H3K27-specific HMTase and that H3K36 trimethylation
is lost in embryonic stem (ES) cells derived from
NSD2-defective mice. But the idea that NSD2 might act
as a trimethylase would not be consistent with a previous
study that identified SETD2 as the sole trimethylase in
mammalian cells. Furthermore, H3K36me3 levels are
reduced when SETD2 is depleted, despite normal levels
of H3K36me2 (REFS 33,34). Regardless, these data are all
consistent with a role for NSD2 in H3K36 methylation.
This is also supported by the observation that, in ~15%
of patients with multiple myeloma (a haematological
malignancy that accounts for 1% of all cancers), NSD2
is overexpressed as a result of a translocation with the
immunoglobulin locus (t(4;14)) and the global levels
of H3K36me2 are increased.
Heterogeneous nuclear
ribonucleoprotein L
(hnRNPL). A member of a
highly abundant family of
RNA-binding proteins known
to associate with newly
synthesized pre-mRNA.

Pro isomerization
Peptide bonds to Pro residues
can exist in either a cis or a
trans state, and the transitions
can be catalysed by prolyl
isomerase enzymes.

Anti-silencing function 1
(Asf1). A histone chaperone
protein that associates with
newly synthesized histones
and is involved in chromatin
synthesis.


Possible specificity of NSD2 for H4K20. Despite strong
evidence linking NSD2 to H3K36me, it may also act as
an H4K20-specific methyltransferase during the cellular
response to DNA double-strand breaks (DSBs).
NSD2 was isolated in an RNA interference screen for
genes involved in resistance to hydroxyurea, a DNA-replication
inhibitor. Consistently, NSD2-defective
cells are sensitive to DNA damage and NSD2 localizes to
sites of DNA damage. Moreover, NSD2 is phosphorylated
at Ser102 by the kinase ataxia telangiectasia
mutated (ATM) upon damage and this mediates the
recruitment of NSD2 to sites of DNA damage, where
it has been suggested to dimethylate H4K20, which
is an important event in recruiting the DNA damage
response regulator p53-binding protein 1 (53BP1).
Although one could envision that NSD2 phosphorylation
and damaged DNA may both alter the catalytic
specificity of NSD2 through allosterism, it should be
noted that recombinant NSD2 methylates H4 in in vitro
assays with H4 peptides. Thus, it is possible that NSD2
does have some specificity for H4 in addition to H3
even in the absence of DNA damage.

The Set2 trimethylase. In yeast, Set2 is non-essential,
performs all three methylation reactions at H3K36, and
couples H3K36me3 with transcriptional elongation
through an interaction with Rpb1, the large subunit of
RNAPII. The human orthologue of Set2, SETD2, also
interacts with RNAPII during elongation, but it is an
essential protein and has been identified as a huntingtin-interacting
protein. The role of SETD2 in Huntington’s
disease is obscure. SETD2 is a trimethylase, and
some reports have indicated that, in vitro, recombinant
human SETD2 can add all three methyl groups
to H3K36 (REFS 32,34). By contrast, in vivo knockdown
experiments targeting SETD2 revealed reduced levels
of H3K36me3 only. This underscores the importance of
using multiple approaches to determine enzyme specificity
and also raises an important point about enzyme
processivity. Does a trimethylase obligatorily require
a dimethyl substrate or can it processively place all
three groups on a Lys substrate? SETD2 associates with
heterogeneous nuclear ribonucleoprotein L (hnRNPL),
and hnRNPL knockdown analyses show decreased
levels of H3K36me3 but not H3K36me1 or H3K36me2
(REF. 34). Thus, the ability to act as a processive enzyme
could depend on the presence and regulation of distinct
cofactors, as well as on the presence of a previously
methylated substrate.

Defining further HMTase specificities for H3K36. Like
other NSD family members, the HMTase NSD3 (also
known as WHSC1L1) appears to be specific for H3K36
(REFS 16,40). So far, relatively little is known about the
substrate specificities of the Trithorax protein ASH1-like
(ASH1L) (an orthologue of the fly protein Absent,
small and homeotic discs 1 (ASH1)), SET domain and
mariner transposase fusion gene-containing (SETMAR;
also known as METNASE), SETD3 and SET and
MYND domain-containing 2 (SMYD2). Each has been
reported to methylate H3K36, particularly with respect
to dimethylation, but other substrate specificities have
also been described for these enzymes. However,
the assays for SETD3 specificity were performed with
peptides, and it will be important to confirm this
enzyme specificity using nucleosome-based assays.

In addition to targeting H3K36, SMYD2 methylates
H3K4 and non-histone substrates, such as p53 and
retinoblastoma (RB). The in vitro methylase activity
of SMYD2 towards H3K4, but not towards H3K36,
depends on its ability to bind heat shock protein 90α
(HSP90α). Although the expression of SMYD2
appears to be largely confined to the brain and heart,
its conditional deletion from mouse cardiomyocytes
showed that it is dispensable for heart development
and, surprisingly, its deletion had no effect on the global
levels of methylated H3K36 (REF. 47). This may suggest
that the physiological target for SMYD2 is a non-histone
substrate, providing an additional example of how the
interpretation of knockout phenotypes for HMTases
must be carefully considered.

Rigorous experiments using recombinant nucleosomes
have demonstrated that ASH1L is a dimethylase
that is specific for H3K36 (REFS 41,48). However,
ASH1L has also been reported to target H3K4 and to
localize within the transcribed regions of active target
genes; and in vivo levels of H3K4me3 were reduced in
the absence of ASH1L. Additional studies have shown
that ASH1L also targets H3K36 and H4K20 (REFS 44,46).
Thus, it remains to be seen what the full substrate repertoire
of ASH1L is. Notably, similarly to NSD1, ASH1L
has an autoinhibitory loop that resides between the
SET and post-SET domains, a feature that has
been proposed as a hallmark of H3K36-specific
enzymes.

Additional factors that influence H3K36 methylation.
Several mechanisms have been described for how
H3K36 methylation levels are regulated at a given locus
(REFS 50-52). In addition to cofactors such as HSP90α
and hnRNPL, Pro isomerization of histone H3 and
additional histone modifications on both H3 and H4
have been shown to regulate the total levels of H3K36
methylation by Set2, although it is not clear how this
occurs. Additional studies in yeast have shown that cyclins
(encoded by bypass UAS requirement 1 (BUR1) and
BUR2) are required for H3K36me3 (REF. 53). Moreover,
the yeast histone chaperone anti-silencing function 1
(Asf1), the RNAPII kinase C-terminal domain (CTD)
kinase 1 (Ctk1) and the elongation factor Spt6 (a subunit
of the FACT (facilitates chromatin transcription)
complex) all regulate the levels of H3K36 trimethylation
but not dimethylation. Other elongation factors,
such as polymerase-associated factor 1 (Paf1), have also
been shown to alter the levels of H3K36 methylation.
Large cells 1 (Lge1), a factor associated with the ubiquitin
pathway, was also identified in a genome-wide
screen for candidate genes that specifically regulate
H3K36 methylation. This study also confirmed the
role of Paf1 in regulating H3K36me3: mutations in
Paf1 disrupted the levels of H3K36me3, but not those of
H3K36me2 (REF. 56). Collectively, these studies suggest
Figure 2 | H3K36me3-dependent prevention of aberrant transcription in yeast.

Actively transcribing RNA polymerase II (RNAPII) displaces acetylated nucleosomes.
These evicted histones are reincorporated into nucleosomes and chromatin behind
the polymerase. As SET domain-containing 2 (Set2) binds RNAPII, this promotes the
trimethylation of Lys36 on histone H3 (H3K36me3) in the newly incorporated
nucleosomes. H3K36me3 serves as a ‘mark’ that the reduced potassium dependency 3
(Rpd3) deacetylase complex binds; this complex facilitates local nucleosome
deacetylation, preventing aberrant, spurious transcription in the wake of RNAPII
progression through a region. CTD, carboxy-terminal domain; HAT, histone acetylase;
m7G, 7-methylguanosine.


an emerging theme: that HMTase cofactors may alter
substrate specificity on a gene-to-gene basis, providing
an additional layer of regulation for their activity at
distinct targets.

Roles  of  H3K36  methylation  in  gene  expression 

Methylation of H3K36 has been observed at multiple
stages of RNA biosynthesis and is not always associated
with gene activation. Rather, its ultimate output
and where it fits into gene expression pathways are
functions of multiple variables: where in the gene body
the methylated H3K36 mark is placed, when H3K36 is
methylated and what reader protein binds to this modification.
Further complicating these variables is the fact
that the degree of methylation can result in different
biological outcomes.


Reader  protein 
A  protein  that  recognizes 
and  binds  post-translational 
modifications  on  histones. 

Chromodomain 
A domain of ~50 residues
that  has  been  shown  to  bind 
methylated  residues. 


H3K36 methylation and transcriptional activation.
Numerous studies in multiple systems support a role
for H3K36 methylation in transcriptional activation. It
has been observed that, in general, there is a progressive
shift from monomethylation to trimethylation of
H3K36 between the promoters and the 3' ends of genes.
However, in zebrafish, somatic cells also show a bias for
H3K36me3 in the 3' end of actively transcribed genes, but
this mark is curiously present in the 5' promoter regions
of quiescent genes that are developmentally regulated
during spermatogenesis. The functional significance of
this is unknown, and it will be important to determine
whether promoter-proximal H3K36me3-containing
nucleosomes can recruit repressive histone deacetylases
to promote gene silencing.

Multiple lines of evidence in budding yeast have
shown that Set2, a non-essential enzyme that is responsible
for all three forms of methylated H3K36, is coupled
to transcriptional elongation. Deletions in yeast Set2
cause sensitivity to the elongation inhibitor 6-azauracil
and phenocopy mutations in other known elongation
factors. However, it remains possible that this
phenotype may arise from indirect effects on nucleotide
metabolism. Nevertheless, Set2 associates with the
hyperphosphorylated form of RNAPII and deposits
the trimethyl group onto H3K36 during elongation
in various systems, including yeast and humans. This
interaction is regulated by phosphorylated residues
in the CTD of Rpb1, the large subunit of RNAPII; the
Ser2-phosphorylated form of the CTD is indicative of
elongation and the Ser5-phosphorylated form is characteristic
of a paused polymerase at a promoter. Indeed,
human SET2 proteins also appear to bind RNAPII and
to target H3K36 (REFS 34,64).

One well-established function of Set2 in yeast
is the prevention of aberrant transcriptional initiation
within coding sequences (FIG. 2). Set2 catalyses
H3K36 methylation co-transcriptionally and recruits
the reduced potassium dependency 3 small (Rpd3S)
deacetylase complex here through association of
the chromodomain-containing Rpd3S subunit ESA1-associated
factor 3 (Eaf3) with H3K36me3; this enforces
a deacetylated chromatin state in the wake of transcribing
RNAPII. Rpd3S preferentially associates with
histones containing H3K36me2 and H3K36me3 but
not H3K36me1 (REF. 57). The PHD finger of the Rpd3S
subunit Rco1 can also cooperate with the Eaf3 chromodomain
to promote these interactions. This indicates
that H3K36me2 and H3K36me3 might act redundantly
in the Set2-Rpd3S pathway.

The role of histone Lys methylation in the maintenance
of a repressive chromatin environment during
transcriptional elongation is also used in humans but
may be independent of acetylation. In this case, Lys
demethylase 2 (LSD2; also known as KDM1B) demethylates
methylated H3K4 in the intragenic regions of
active target genes. LSD2 resides in complexes with
NSD3, which acts as a mono- and dimethylase that is
specific for H3K36 (REF. 16). Additionally, LSD2 binds the
H3K9-specific HMTase G9A (also known as EHMT2).
Consistent with this, the LSD2 complex has robust
HMTase activity towards H3K9 and, to a lesser extent,
H3K36 (REF. 70). Moreover, in chromatin immunoprecipitation
followed by microarray (ChIP-chip)
experiments, depletion of G9A alters both H3K9 methylation
levels and transcriptional regulation at LSD2
targets. In light of these observations and the fact that
LSD2 resides in complexes with elongation factors such
as cyclin T1 (CCNT1; also known as PTEFb) and the
Ser2-phosphorylated form of RNAPII, it appears that
the methylation-dependent enforcement of a repressive
chromatin state through H3K9 and possibly H3K36
methylation at actively transcribed genes is evolutionarily
conserved. Additionally, it seems that NSD3 can
participate in transcriptional elongation. Interestingly,
the LSD2 complex contains multiple proteins with
PWWP domains, including NSD3. Originally identified
in NSD2, PWWP domains also bind H3K36me3
to act as reader proteins. In particular, the PWWP
domain of DNA methyltransferase 3A (DNMT3A)
binds H3K36me3 and methylates nearby DNA,
demonstrating a link between H3K36me3 and DNA
methylation that could represent a new mechanism
for H3K36 methylation-mediated repression of gene
expression.

H3K36 methylation in dosage compensation.
Coordination between methylation and acetylation
also occurs during dosage compensation in D. melanogaster.
To compensate for having only one X chromosome,
male flies use the Male-specific lethal (MSL)
complex to upregulate the expression of X-linked genes
by a factor of two in order to achieve the same level of
transcription as females, which have two X chromosomes.
This may occur through a two-step model requiring
H3K36me3 and increased elongation. The MSL
complex is initially recruited to X-linked genes through
a GA-rich recognition site and subsequently facilitates
transcriptional elongation through an enhancement
of RNAPII activity. Next, the MSL3 subunit (an orthologue
of yeast Eaf3) is thought to promote spreading
of the upregulated state through interactions that involve
the chromodomain of MSL3 and H3K36me3 (REF. 73).
Additional active marks, such as H4K16 acetylation,
also affect X-linked gene expression, requiring the Males
absent on the first (MOF) subunit of MSL. Thus, a collection
of activating marks together achieves chromosome-wide
gene expression. Although interactions between
chromodomain proteins and H3K36me3 marks regulate
acetylation in both yeast and D. melanogaster, they have
opposite effects: transcription is repressed in yeast but
upregulated in flies. This clearly illustrates that the function
of a given histone modification, such as H3K36me3,
is context-dependent and is much more complicated
than a simple code.

NSD enzymes and transcriptional initiation. NSD1
regulation of H3K36 methylation appears to affect
transcriptional initiation. For example, NSD1 binds
upstream of the promoter of BMP4 and regulates the
levels of H3K36me1, H3K36me2 and H3K36me3
within the body of the gene. This seems to be required
for the recruitment of RNAPII to the BMP4 promoter,
an observation that links NSD1 to initiation events
through RNAPII. NSD3 can also bind promoters and
then influence the levels of H3K36 methylation within
the body of a gene. For example, the extraterminal
domains of bromodomain-containing 4 (BRD4) specifically
recruit NSD3 to the promoter region of the gene
encoding CCND1 and the decapping enzyme DCPS,
and loss of either BRD4 or NSD3 reduces the levels of
H3K36me3, particularly within the body of the gene.
This is consistent with the NSD1-dependent regulation
of BMP4, during which promoter-proximal bound
NSD1 regulates H3K36me3 levels within the body
of the gene. Although NSD3 localization is biased
towards the promoter of CCND1, it also can be found,
albeit at reduced levels, at the 3' end of the gene, suggesting
that NSD3 participates in initiation and elongation
events. Moreover, NSD3 levels were found to
be higher in the coding regions than in the promoter
of the proto-oncogene Ser/Thr kinase PIM2 (REF. 40).


Thus, NSD3 appears to reside in multiple complexes
(including LSD2 and BRD4 complexes) and can localize
to either promoter or internal regions to promote
H3K36 methylation and thereby influence initiation
and elongation processes. This is in apparent contrast
to NSD1, which has been reported to localize primarily
near the 5' ends of its targets; it will be important
to confirm this using high-resolution ChIP followed
by sequencing (ChIP-seq) analysis. However, previous
ChIP-seq data show that NSD2 is enriched near
transcription start sites.

H3K36 methylation in transcriptional repression.
Fascinating clues have emerged regarding the relationship
between methylation at H3K36, transcriptional
repression and worm development. In worms,
maternally provided MES-4 is essential for germ cell
viability and is involved in X-chromosome silencing in
the germline. Early primordial germ cells (PGCs)
do not engage in transcription as they lack the active
hyperphosphorylated CTD form of RNAPII. But, in
the absence of mes-4, active RNAPII persists, indicating
that MES-4 is important for the establishment and/or
maintenance of transcriptional repression in late stage
PGCs. Loss of MES-4 also reduces the levels of
H3K36me3, independently of RNAPII function. Thus,
MES-4 maintains the H3K36me3 state in germline precursors
independently of transcription, an observation
that has not been made in other organisms so far.

Beyond H3K36 methylation-mediated regulation
of spurious initiation, H3K36 methylation-mediated
repression has also been reported. Yeast Set2 represses
transcription from a lacZ reporter and, in Set2-deletion
mutants, the basal levels of GAL4 transcription are
increased; however, in these studies, Set2 is artificially
recruited to the promoter, so it will be important
to confirm whether this reflects the true nature of Set2
activity. Initial studies with tethered reporters in mammalian
cells have implicated NSD1 in both transcriptional
activation and repression, although its targets
have not yet been determined. In addition, an RNA
interference study has shown that NSD1 represses
the expression of the homeobox regulator MEIS1 in
neuronal cell lines. This repressive event is likely to
be direct, as NSD1 localizes to the promoter of MEIS1
(REF. 78). Furthermore, NSD2 collaborates with the
cardiac-specific factor NKX2-5 to repress targets such
as platelet-derived growth factor-α (PDGFRα), probably
through modulating H3K36 methylation levels. Thus,
H3K36 methylation appears to act as both an activating
and inhibitory signal, and the overall biological readout
might depend on the context of additional surrounding
marks and their corresponding reader proteins.

H3K36  methylation  and  exon  definition.  There 
has  been  increasing  support  for  the  idea  that  biased 
nucleosome  positioning  favours  exonic  over  intronic 
sequences. Such a bias seems to be accompanied by
distinct  histone  modifications,  including  H3K36me3. 
This  should  not  be  surprising,  as  the  147-base-pair 
length  of  a  DNA  fragment  associated  with  a  histone 


Figure 3 | H3K36me3 influences alternative splicing in a cell-type specific manner.

The fibroblast growth factor receptor 2 (FGFR2) locus that undergoes alternative splicing
consists of two mutually exclusive exons, IIIb and IIIc, which are located between the
constitutive exons 7 and 10. Mesenchymal stem cells favour the inclusion of exon IIIc and
achieve this by repressing splicing of exon IIIb. Nucleosomes present near exon IIIb
contain the SET domain-containing 2 (SETD2)-dependent trimethylated Lys36 on histone
H3 (H3K36me3) ‘mark’ and its reader protein MORF-related gene 15 (MRG15). MRG15
also interacts with polypyrimidine tract-binding protein (PTB), a known repressor of exon
inclusion, and this may be the mechanism by which the methylated H3K36 mark can
influence splicing at this locus. In epithelial cells, FGFR2 expresses exon IIIb but excludes
exon IIIc. Epithelial splicing regulatory protein (ESRP) is expressed and stimulates the
inclusion of exon IIIb; reduced levels of H3K36me3 present at this exon, possibly as a result
of lower SETD2 levels, allow its derepression. The role of dimethylases, such as the
proteins of the nuclear receptor SET domain-containing (NSD) family, in this process has
yet to be determined but these enzymes could also influence H3K36 methylation here.
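
The two cell-type scenarios in this legend can be restated as a small decision function. This is a deliberately crude sketch: the function name `predicted_fgfr2_exon` and its two Boolean inputs are hypothetical simplifications, and combinations the legend does not describe are returned as "undetermined" rather than guessed.

```python
def predicted_fgfr2_exon(h3k36me3_high_at_iiib: bool, esrp_expressed: bool) -> str:
    """Toy restatement of the Figure 3 model for the mutually exclusive FGFR2 exons.

    h3k36me3_high_at_iiib : H3K36me3 (read by MRG15, which interacts with the exon
                            repressor PTB) is abundant near exon IIIb
    esrp_expressed        : the epithelial splicing regulator ESRP is present
    """
    if h3k36me3_high_at_iiib and not esrp_expressed:
        # Mesenchymal-like case: exon IIIb is repressed, so exon IIIc is included.
        return "IIIc"
    if esrp_expressed and not h3k36me3_high_at_iiib:
        # Epithelial-like case: exon IIIb is derepressed and stimulated by ESRP.
        return "IIIb"
    # The legend does not address the remaining combinations.
    return "undetermined"


if __name__ == "__main__":
    print(predicted_fgfr2_exon(True, False))
    print(predicted_fgfr2_exon(False, True))
```

Returning "undetermined" for the mixed cases reflects that the legend describes only two states and leaves the contribution of dimethylases open.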


Polypyrimidine  tract-binding 
protein 

(PTB).  A  protein  that  has  been 
implicated  as  an  antagonist  of 
exon  definition,  the  action 
of  which  results  in  the 
repression  of  exon  inclusion. 

Fibroblast  growth  factor 
receptor  2 

(FGFR2).  A  membrane-bound 
receptor  that  undergoes 
alternative  splicing  and  is 
subject  to  regulation  by 
methylation of Lys36 on
histone  H3. 

Epithelial  splicing  regulatory 
protein 

(ESRP).  An  alternative  splicing 
factor  that  is  enriched  in 
epithelial  tissues  and  is 
responsible  for  enforcing 
specific  exon  inclusion. 


octamer  correlates  with  the  average  length  of  an  internal 
exon.  This  is  likely  to  be  more  than  just  mathematical 
serendipity;  it  is  probably  evidence  of  interplay  between 
chromatin and splicing. Several large-scale bioinformatics
studies have analysed both the positions of
nucleosomes and their modification status within the
genomes of humans, C. elegans, D. melanogaster and
mice. In each case, nucleosomes were enriched
specifically at exonic sequences. Although the increased
deposition of nucleosomes at exons guarantees a bias
in histone modifications within exons relative to those
within introns, it is also clear that a subset of modifications
is specifically enriched here. This is particularly
true  for  H3K36me3  but  also  includes  methylation  at 
H3K79,  H4K20  and  H2BK5  (REF.  80).  Each  analysis  also 
found  that  the  H3K36me3  bias  is  more  pronounced 
within  exons  further  downstream  of  the  transcription 
start  site.  This  preference  may  reflect  the  propensity  of 
RNAPII  to  abort  transcription  early  in  the  transcription 
cycle, thereby reducing the number of nucleosomes that
are displaced further downstream. The known association
between Set2 and the RNAPII CTD may also
further explain the particular increase in H3K36me3
signatures seen at downstream exons. The implication
of nucleosome enrichment and the increase in
H3K36me3 modifications is twofold. First, nucleosomes
probably act as intrinsic pause sites for elongating
RNAPII, which could alter splice site recognition and,
hence, change exon inclusion. Others have found that
the introduction of pause sites within minigenes can
increase the inclusion of alternatively spliced exons.
Furthermore, expression of a ‘slow’ mutant RNAPII in
D. melanogaster results in different inclusion patterns of
the exons within the Ultrabithorax mRNA. A second
possibility, which is not mutually exclusive with effects
on RNAPII pausing, is that the H3K36me3 modification
relays a specific signal to the splicing machinery
to alter how it defines exons, leading to the specific
inclusion or exclusion of particular exons.

Although the global analyses of H3K36me3 positioning
in various genomes provide compelling evidence
that this modification affects splicing, functional
evidence beyond this has been lacking. However, an
interesting connection has been made between SETD2,
the reader protein MORF-related gene 15 (MRG15;
which contains a chromodomain) and polypyrimidine
tract-binding protein (PTB), the last of which is a known
antagonist of exon definition that affects splicing of
fibroblast growth factor receptor 2 (FGFR2) pre-mRNA.
FGFR2 contains two mutually exclusive exons (IIIb
and IIIc) that encode a region within the extracellular
immunoglobulin-like domain and are each responsible
for receptor binding to a unique range of FGFs.
Exon IIIb is included in epithelial cells through the
action of epithelial splicing regulatory protein (ESRP),
whereas exon IIIc is included in cells of mesenchymal
origin (FIG. 3). Furthermore, the splicing pattern
switches from exon IIIb to exon IIIc inclusion as prostate
epithelial cells become androgen-independent, an
important factor in metastasis. Analysis of nucleosome
modifications throughout FGFR2 shows that H3K36me3
is specifically enriched within exon IIIb and is restricted
to mesenchymal cells, which exclude this exon. This
H3K36me3 modification is recognized by MRG15,
which also interacts with PTB. Thus, by recruiting PTB
to its target exon, these interactions position PTB to
bind to its intronic splicing silencer sites, which flank
the repressed exon, as they emerge from the transcribing
RNAPII complex (FIG. 3). PTB repression of this exon
can be alleviated by downregulating either MRG15 or
SETD2. This regulatory module also exists at other
alternatively spliced exons, with a bias towards exons
that contain weaker PTB-binding sites. What remains
to be seen is how two distinct cell types achieve this differential
methylation of H3K36 within nucleosomes at
alternatively spliced exons in order to regulate splicing.

This  example  of  FGFR2  control  exemplifies  how 
the  methylation  status  of  H3K36  can  affect  splicing. 
However,  this  crosstalk  is  bidirectional:  mutations  in 
splice sites that abrogate intron removal of β-globin
reporter  genes  cause  a  shift  in  H3K36me3  signatures 
Non-homologous end
joining 

(NHEJ).  The  main  pathway 
that  is  used  primarily  in  the 
G1  phase  of  the  cell  cycle  to 
repair  chromosomal  DNA 
double-strand  breaks  in 
somatic  cells.  In  contrast  to 
homologous-recombination 
repair,  NHEJ  is  error-prone 
because  it  leads  to  the  joining 
of  heterologous  ends. 


Figure 4 | Model for H3K36 methylation at sites of
DNA damage. Upon the generation of DNA damage,
SET domain and mariner transposase fusion gene-containing
(SETMAR) dimethylates Lys36 on histone H3
(H3K36) near sites of DNA double-strand breaks (DSBs),
possibly by recruiting currently undefined reader proteins
that facilitate the binding of KU70 and the MRE11-RAD50-NBS1
(MRN) complex to facilitate DNA repair. How
methylation at H3K36 is coordinated with other chromatin
modifications that are known to participate in DNA repair
(such as p53-binding protein 1 (53BP1) binding to
dimethylated H4K20 (H4K20me2)) is unknown. NSD2,
nuclear receptor SET domain-containing 2.


towards the 3' region of genes. Moreover, others have
observed that H3K36me3 signatures are markedly lower
in genes without introns. In both cases, inhibition of
splicing with pharmacological agents causes a rapid
and global redistribution of H3K36me3 throughout
the genome and reduces SETD2 recruitment to these
genes. Although it is unclear how the spliceosome
regulates  H3K36  methylation,  these  studies  suggest  that 
it  is  a  general  phenomenon. 

H3K36  methylation  in  DNA  replication,  recombination 
and  repair.  As  replication  origins  initiate  at  distinct 
times  in  S  phase,  the  decision  to  fire  at  any  particular 
origin  depends  on  a  number  of  factors,  including 
the  transcriptional  status  of  nearby  genes.  Origins  of 
replication are bound by the origin recognition complex
(ORC), which recruits various factors, including
CDC6, minichromosome maintenance (MCM) proteins
and CDC45, before loading of the DNA polymerase.
Studies in budding yeast have revealed a role
for H3K36 methylation in regulating replication origin
firing, as deletions in SET2 cause a delay in Cdc45
loading at origins. H3K36 methylation has also been
linked to DNA replication checkpoint control in fission
yeast. Furthermore, yeast mutants in the FACT elongation
complex are sensitive to the replication inhibitor
hydroxyurea, and this sensitivity can be suppressed by
mutations in SET2 (REF. 94). How H3K36 methylation
influences  these  S  phase  events  is  unknown,  but  it  could 
keep  chromatin  in  an  active  state,  perhaps  through 
recruiting  specific  reader  proteins  that  can  influence 
DNA  replication  control. 


Histone methylation is also critical for the maintenance
of genomic stability, including the biological
response to DSBs. For example, H3K36me2 has been
implicated in the repair response to DSBs through the
non-homologous end-joining (NHEJ) pathway. Here, the
methylase SETMAR, a SET domain-containing enzyme
previously reported to affect NHEJ as well as DNA replication,
was shown to directly mediate H3K36me2 formation
near sites of DSBs (FIG. 4). SETMAR-dependent
formation of H3K36me2 was shown to enhance the
rate of association of the DNA repair factors KU70 and
Nijmegen breakage syndrome 1 (NBS1; also known as
nibrin) near DSBs. However, it is not yet clear how methylated
H3K36 acts at the sites of DSBs, particularly with
respect to which proteins may respond to the presence of
this mark, or what relationship this mark might have with
other known histone modifications that are important for
DSB repair. Most notably, this may include crosstalk with
dimethylation of H4K20, an event that recruits the DNA
damage response regulator 53BP1 to sites of damage. It is
also possible that methylated H3K36 in this context mediates
its effects by antagonizing transcriptional elongation,
as RNAPII is displaced in response to DNA damage.

H3K36  methylation:  links  with  disease 

In higher eukaryotes, defects in the genes that maintain
the levels of H3K36 methylation cause developmental
defects and disease (TABLE 2). Defects in SETD2 are
causal for sporadic clear renal cell carcinoma, a disease
that is marked by the loss of the short arm of chromosome 3
(REF. 99). SETD2 has also been hypothesized to
be a tumour suppressor in breast cancer. Defects
in each member of the NSD family have been implicated in
multiple diseases and cancer types. For example,
haploinsufficiency in NSD1 is causal for Sotos overgrowth
syndrome. NSD1 has also been implicated in breast, lung
and prostate cancer, acute myeloid leukaemia (AML) and
refractory anaemia. NSD1-defective mice
display an embryonic-lethal phenotype: the embryos initiate
mesoderm formation but fail to complete gastrulation,
apparently as a result of apoptosis. NSD2-defective mice
die shortly after birth and display features that are consistent
with Wolf-Hirschhorn syndrome, including facial
abnormalities and cardiac defects. Thus, NSD1-knockout
mice and NSD2-knockout mice have different phenotypes,
despite the fact that both proteins catalyse the formation
of H3K36me2. No knockout-mouse phenotype
has yet been reported for NSD3.

Each NSD family member behaves as an oncogene
in multiple cancers. Translocations in NSD1 or
NSD3 lead to the development of AML. Moreover,
NSD1, an androgen receptor co-regulator, has been
identified as a candidate gene capable of discriminating
between malignant and non-malignant prostate tissue.
In addition, silencing of NSD1 has also been shown to
cause sensitivity to the oestrogen receptor antagonist
tamoxifen, but how these two observations are linked to
the ability of NSD1 to bind and regulate either the androgen
or oestrogen receptors is unknown. Overexpression
of NSD2 has also been implicated in multiple myeloma
and additional cancers, such as neuroblastoma, and
Table 2 | H3K36 methyltransferases and their roles in diseases

Gene | Diseases | Molecular defects | Phenotypes | Refs
NSD1 | Sotos syndrome | Haploinsufficiency, point mutations, deletions, translocations | Macrocephaly, hypertelorism, cognitive and/or motor skill deficiencies | 98
NSD1 | Myelodysplastic syndrome | Translocations | Anaemia, cytopenia | 111
NSD1 | Cancers: AML, prostate, neuroblastoma, breast | Overexpression, gene silencing, translocations | Numerous tumour types | 19, 78, 101, 110, 112
NSD2 | Wolf-Hirschhorn syndrome | Deletions | Learning difficulties, microcephaly, heart defects | 32
NSD2 | Multiple myeloma | t(4;14)* translocation | Renal failure, anaemia, bone lesions | 30, 115
NSD3 | Breast cancer | Gene amplification at 8p11 | Solid tumours | 107
NSD3 | AML | Translocations | Leukaemic cells in bone marrow | 105
NSD3 | Myelodysplastic syndrome | Translocations | Anaemia, cytopenia | 106
SETD2 | Renal cell carcinoma | Deletions, missense mutations | Haematuria, flank pain | 99

The table lists the known diseases resulting from alterations in the genes encoding enzymes that are specific for Lys36 on histone H3
(H3K36). AML, acute myeloid leukaemia; NSD, nuclear receptor SET domain-containing; SETD2, SET domain-containing 2.


overexpression of NSD3 via gene amplification occurs in
about 15% of breast cancers and correlates with poor
prognosis.  Therefore,  NSD  proteins  are  instrumental  in 
the  development  of  cancer,  but  their  mechanisms  of  action 
in  this  context  are  only  beginning  to  be  elucidated.  One 
possible  mechanism  underlying  NSD-mediated  disease  is 
that  these  proteins  enforce  H3K36  methylation  patterns 
to  change  gene  expression.  This  is  discussed  below  in  the 
context  of  well-known  NSD-dependent  cancers. 

Through overexpression or their incorporation into
fusion proteins via translocations, NSD proteins act as
potent oncoproteins. For example, fusion proteins
between NSD1 and the nucleoporin NUP98 are well
established as being causal for a subset of AML, and
they regulate homeobox (Hox) gene expression in a mouse
model of AML. In the case of AML, NSD1-NUP98
fusions enforce an H3K36 methylation signal that contributes
to the inappropriate activation of Hox genes during
development. These increased H3K36 methylation levels
seem to be antagonistic to the repressive methylated
H3K27 marks that normally silence Hox gene expression,
leading to reduced H3K27 methylation levels and gene
activation. As a consequence, unscheduled cell proliferation
and the generation of AML are observed in the
mouse model.

Overexpression of NSD2 via a translocation between
chromosome 4 and chromosome 14 alters the global
profile of H3K36 methylation in KMS11 multiple myeloma
cells. Moreover, the catalytic activity of NSD2 is
required for tumorigenesis in a mouse xenograft model.
And loss of NSD2 function (through either short hairpin
RNA-mediated depletion or homologous recombination)
globally decreases H3K36me2 and H3K36me3 levels
but has no effect on H4K20 or H3K4 methylation.
It is, however, accompanied by a concomitant decrease
in the global levels of H3K27me2 and H3K27me3, and
this downregulation results in the expression of target
genes that would normally be quiescent. This suggests
that the overexpression of NSD2 alters transcriptional
programming by promoting the formation of H3K36me2
and thereby changing the chromatin structure. Indeed,
ChIP-seq analysis has precisely mapped the distribution
of H3K36me2 in KMS11 multiple myeloma cells overexpressing
NSD2 (REF. 30). In cells that have one normal allele
of NSD2, the H3K36me2 signal revealed by ChIP-seq
was preferentially enriched in intragenic regions, which
is consistent with previous ChIP-chip results in flies.
However, overexpression of NSD2 disperses an increased
H3K36me2 signal throughout the genome, including
intergenic regions. Cells expressing one normal allele
of NSD2 have a modest H3K36me2 signal in promoter
regions, followed by a peak at transcriptional start sites,
and then the signal decays downstream to the 3' end, and
a high signal corresponds with higher transcriptional activation.
By contrast, cells overexpressing NSD2 have little
variance in H3K36me2 signal intensity across the average
gene and the correlation between signal levels and transcription
is lost. Instead, as some genes are much more
sensitive to H3K36me2 levels than others, this aberrant
enrichment of the H3K36me2 signal appears to trigger
the expression of quiescent oncogenes, including transforming
growth factor alpha (TGFA), MET, p21 activated
kinase 1 (PAK1) and RRAS2 (related RAS viral (r-ras)
oncogene homologue).

H3K36 and H3K27 methylation may be mutually
exclusive modifications. Nucleosomes pre-methylated
at H3K27 are largely refractory to enzymatic catalysis at
H3K36 and vice versa. As overexpression of NSD3 is
causal for the generation of breast cancer through 8p11-12
amplifications, one exciting possibility is that global
gene expression profiles are drastically altered through
a switch from H3K27 to H3K36 methylation. When
considering this and the fact that antagonism between
H3K36 and H3K27 methylation has also been observed
in worms, one might anticipate that this could be a general
mechanism for the oncogenic behaviour observed
Figure 5 | NSD proteins can act as oncoproteins. Through overexpression and/or
translocation events that result in the fusion of nuclear receptor SET domain-containing
(NSD) proteins with other proteins, such as the nucleoporin NUP98, NSD proteins can be
aberrantly recruited to target loci in various tissues. As a consequence, global levels of
dimethylated Lys36 on histone H3 (H3K36me2) increase and are sufficient to activate
inappropriate transcription, which contributes to cancer development. In some cases,
the increased levels of H3K36me2 are expected to inversely correlate with trimethylated
H3K27 (H3K27me3; not shown), altering the balance between competitive activating
and repressive ‘marks’. As a consequence, multiple gene sets that are sensitive to the
levels of H3K36me2 are turned on, a causal event in oncogenesis. AML, acute myeloid
leukaemia; AR, androgen receptor; RNAPII, RNA polymerase II.


for enzymes that promote H3K36 or H3K27 methylation
(FIG. 5). But NSD proteins are not solely restricted
to acting as oncoproteins; they can also act as tumour
suppressors, as a lack of NSD1 expression is observed in
neuroblastomas.


Conclusions 

Although widely perceived as an activating modification,
H3K36 methylation also functions in transcriptional
repression as well as in processes such as DNA
repair. Thus, H3K36 methylation function extends
beyond transcription and is likely to be relevant in
a wide range of DNA-based processes. So far, much
effort has gone into understanding the nature of the
enzymes and their substrate specificities, but little is
known about how the enzymes that mediate H3K36
methylation are localized and regulated beyond the
well-characterized model of yeast Set2 and its relationship
with RNAPII. Given that NSD1 binds various
nuclear receptors, it is clear that this H3K36-specific
methyltransferase, and probably others, operates in
distinct tissues and perhaps in separate complexes.
Similarly, NSD2 interacts with different transcription
factors in mouse ES cells versus heart cells from embryonic
day 12.5 (REF. 32). Thus, elucidating the distinct
complexes that these enzymes form in different cell
types and at different stages of development will be
important. As mammalian cells possess multiple, non-redundant
enzymes that methylate H3K36 to varying
degrees, the roles of this modification are expected
to be more complex than in yeast. Despite this, it is
apparent that a role for methylated H3K36 in coordinating
crosstalk between acetylation and methylation
is conserved between flies and higher eukaryotes.
This crosstalk results in a given H3K36 modification
that is interpreted by the reader protein in the context
of neighbouring histone modifications, potentially
changing its meaning. Undoubtedly, this ‘combinatorial
histone language’ will be the subject of future studies
of methylated H3K36 that will expand its scope of
importance to the chromatin landscape.